Use ImportResolver for tag imports#8254
Conversation
f67ef57 to
04ab34d
Compare
d5af3e5 to
f050e53
Compare
| // internal name. | ||
| std::vector<Literals> definedGlobals; | ||
| std::vector<std::unique_ptr<RuntimeTable>> definedTables; | ||
| std::vector<Tag> definedTags; |
There was a problem hiding this comment.
Why do we need std::unique_ptr for definedTables but not for definedGLobals or definedTags? Or could we get rid of the unique ptr in a separate change?
There was a problem hiding this comment.
definedTables are unique_ptrs because RuntimeTable is a virtual type. It also makes a little more sense for tables than for globals and tags since tables are more like 'control' while globals and tags are more like plain data (setting aside what we discussed about tags having an identity).
We could potentially remove the indirection by changing it to RealRuntimeTable but it isn't possible right now because we inject a fake RuntimeTable in ctor eval: link. Once we resolve the TODO mentioned there, we could change it. On that note: what's the reasoning for wanting to remove the unique_ptr here? Is it speed or something else?
There was a problem hiding this comment.
Makes sense, thanks. My motivation for asking is that I think it would be nice to be as uniform as possible with how we treat the different kinds of module fields and their corresponding runtime objects here. So I would have expected the vector of defined tags to look very much like the vector of defined tables, i.e. both or neither using unique_ptrs. (This is also why I would expect the tags to be represented by RuntimeTag objects containing pointers to the instantiated Tag definitions, just like RuntimeTable has a pointer to a Table.)
I would go as far as to say that RuntimeTag doesn't need to support an equality operator or do anything except wrap a Tag*. We can tell if two RuntimeTags are the same if they physically have the same address.
There was a problem hiding this comment.
So is you point that because these are all module fields / importable things that they should all either be unique_ptrs or not unique_ptrs? I feel that they have that in common, but what they don't have in common is whether they encode data or behavior. Literals are closer to data and tables/memories/functions are more of behavior. e.g. for tables we disable copy construction and move assignment, but it's fine to do this with a Literal (and we do do that here). Also it's a little strange (although allowed) to do copy assignment on the pointed-at value of a unique_ptr, which the former code link would do if we changed definedGlobals to hold unique_ptrs.
The rule I'm following is that value objects can be held directly, while behavior objects disable copying/assignment and are held with unique_ptrs. From that perspective it's fine if globals aren't held in a unique_ptr.
I didn't mention tags because they're in an in-between state in this PR. Here I'm using a 'data' object to represent them, but it would mostly be wrong to copy them since it would change its identity. In the next PR I plan to add a type for them that can't be copied, or maybe I'd represent them with unique_ptrs instead.
There was a problem hiding this comment.
In my mental model, these are all just representations of things held in the WebAssembly store, so they all just hold data. As an additional object oriented C++-ism, we may also give them methods instead of operating on them with free functions, but I don't see the data/behavior distinction you're describing.
This mental model also explains why I think it's natural to determine whether two runtime tags are the same by comparing their addresses (i.e. unique_ptr values). The formal semantics also just compares the addresses of tags when determining whether a catch handler applies.
There was a problem hiding this comment.
(although I noticed that the spec has a bug here 🤦 WebAssembly/spec#2076)
There was a problem hiding this comment.
Backing up it sounds like you'd want one of these two options?
std::vector<std::unique_ptr<Literals>> definedGlobals;
std::vector<std::unique_ptr<RuntimeTable>> definedTables;
// ...etcor
std::vector<Literals> definedGlobals;
std::vector<RealRuntimeTable> definedTables;
// ...etcI think option 1 is bad because it forces an extra indirection for globals that we don't need, and 2 is bad because:
- It requires us to use the concrete class instead of the interface, which is relevant because we have cases that want to overload the handling of tables (ctor-eval does this, and the fuzzer does not but does use ModuleRunner directly so it wouldn't be able to overload a table's behavior without also subclassing ModuleRunnerBase if we use RealRuntimeTable here).
- It requires RealRuntimeTable to be movable, and I think it shouldn't be. It can be made movable but ideally tables should be created once and then only referenced, which also matches Wasm's semantics better. We can avoid the need for it to be movable by using std::deque instead, but I think unique_ptr is still more convenient for non-movable types, e.g. a deque can't delete from the middle with non-moveable types.
There was a problem hiding this comment.
Yeah, I think maintaining the ability to have users of the interpreter provide their own implementations of tables (and maybe memories, etc.) is important, so I would go with using the unique_ptrs. I don't think the extra indirection will make much of a difference in practice, since the import resolver would be giving out raw pointers or references to Literals or RuntimeTable either way.
... but the decision reduces to determining whether adding an unnecessary unique_ptr around the Literals is worth the uniformity of having the unique_ptr for everything. I guess I don't feel too strongly about it, so the current approach sgtm too.
|
The fuzzer is reporting Do you need a testcase @stevenfontanella , or is that error message already clear enough? |
Fix for [fuzzer-detected crash when ctor-eval runs on a module that imports a tag](#8254 (comment)). Prior to #8254, ctor-eval would [crash](https://github.com/WebAssembly/binaryen/blob/23d218d0bd469a399ff17b26fdd71164beeb63fa/src/tools/wasm-ctor-eval.cpp#L396) when an imported tag was evaluated, but not when imported. Change the code to allow imported tags even during evaluation. Note that we can't reason about the identity of imported tags. In the following code, $t1 and $t2 may be the same or different tags: ```wasm (import "foo" "bar" (tag $t1)) (import "foo" "bar2" (tag $t2)) ``` In this PR, we assume that $t1 and $t2 are different tags, and that they're the same tag if the import name is the same (this is also not true in general, the hosting environment may provide two different values for the same exact import name). This may cause some correctness issues. As a followup, we can make equality comparison of two imported tags throw FailToEvalException to make evaluation correct. Part of #8180.

Part of #8180 and #8261. Fixes the semantics/spec test when the same tag is imported in different instances, in which case the tag should behave as a new identity, which was previously not the case (see the tags in the modified instance.wast in this PR).